Dev #105

Merged
zTgx merged 10 commits into main from dev
Apr 22, 2026

Conversation

zTgx commented Apr 22, 2026

Summary

Changes

Checklist

  • Code compiles (cargo build)
  • Tests pass (cargo test --lib --all-features)
  • No new clippy warnings (cargo clippy --all-features)
  • Public APIs have documentation comments
  • Python bindings updated (if Rust API changed)

Notes

zTgx added 10 commits April 22, 2026 08:33
- Add detailed README.md documenting the full high-level Vectorless Python API
- Implement main.py demonstrating Session and SyncSession classes usage
- Cover 10 key topics including session creation, indexing, querying, and metrics
- Include sample documents for architecture, finance, and security domains
- Provide examples for both async and sync APIs with proper error handling
- Demonstrate event callbacks, document management, and streaming queries

…ve tests

Implement proper UTF-8 character boundary handling when truncating long
feedback strings to prevent panics with multi-byte characters like emojis
and Chinese characters. Replace unsafe byte-based slicing with
ceil_char_boundary method and add extensive test coverage for various
UTF-8 scenarios including ASCII, multi-byte characters, emojis, and
short strings.
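
The boundary handling described above can be sketched on stable Rust with `str::is_char_boundary`. This is an illustration only, not the commit's code: `ceil_boundary` and `truncate_feedback` are hypothetical names, and the manual loop stands in for the `ceil_char_boundary` method the commit message mentions.

```rust
/// Round `index` up to the nearest UTF-8 character boundary in `s`.
/// Hypothetical helper standing in for `str::ceil_char_boundary`.
fn ceil_boundary(s: &str, index: usize) -> usize {
    if index >= s.len() {
        return s.len();
    }
    let mut i = index;
    // `s.len()` is always a boundary, so this loop terminates.
    while !s.is_char_boundary(i) {
        i += 1;
    }
    i
}

/// Truncate near `max_bytes` without splitting a multi-byte character.
/// Slicing with a raw byte index (`&s[..max_bytes]`) would panic when
/// the index falls inside a character.
fn truncate_feedback(s: &str, max_bytes: usize) -> &str {
    &s[..ceil_boundary(s, max_bytes)]
}
```

Because the index is rounded up, the result may exceed `max_bytes` by up to three bytes; rounding down would be the choice if the limit must be strict.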

fix(engine): improve document loading error handling during graph rebuild

Enhance error reporting and handling when loading documents for graph
rebuilding by tracking individual document failures, adding detailed
warnings for missing or inaccessible documents, and providing more
granular statistics about successful vs failed loads.
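
One way to track per-document failures while continuing the rebuild is sketched below. The `LoadReport` type, the `load_documents` function, and the closure-based loader are all assumptions made for illustration; the engine's real types are not shown in this PR text.

```rust
/// Outcome of loading a batch of documents (hypothetical shape).
#[derive(Debug, Default)]
struct LoadReport {
    loaded: Vec<String>,
    /// Pairs of (document id, error message) for every failed load.
    failed: Vec<(String, String)>,
}

/// Try every id, record failures individually, and keep going.
fn load_documents<F>(ids: &[&str], mut load: F) -> LoadReport
where
    F: FnMut(&str) -> Result<String, String>,
{
    let mut report = LoadReport::default();
    for &id in ids {
        match load(id) {
            Ok(doc) => report.loaded.push(doc),
            Err(err) => {
                // Warn per document instead of aborting the whole rebuild.
                eprintln!("warning: skipping document '{id}': {err}");
                report.failed.push((id.to_string(), err));
            }
        }
    }
    report
}
```

The caller can then report `report.loaded.len()` and `report.failed.len()` as the loaded and failed counts.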

refactor(understand): improve JSON parsing robustness and error messages

Update JSON extraction from LLM responses to properly handle code
fences with language tags and missing closing fences. Add detailed
warning logs for parsing failures and fix edge cases where JSON keys
start with fence identifier letters ('j', 's', 'o', 'n').
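
A minimal sketch of fence-tolerant extraction follows, assuming the response is either bare JSON or a single fenced block. `extract_json` is a hypothetical name. Dropping the whole opening fence line, rather than trimming individual letters, is one way to avoid eating a JSON key that happens to start with 'j', 's', 'o', or 'n'.

```rust
/// Return the JSON payload of an LLM response, with any code fence removed.
/// Hypothetical helper; handles a language tag and a missing closing fence.
fn extract_json(response: &str) -> &str {
    let trimmed = response.trim();
    let Some(rest) = trimmed.strip_prefix("```") else {
        // No fence at all: assume the response is already bare JSON.
        return trimmed;
    };
    // Skip the rest of the opening fence line (for example the tag "json").
    let body = match rest.find('\n') {
        Some(pos) => &rest[pos + 1..],
        None => rest,
    };
    // The closing fence is optional: truncated responses often lack it.
    let body = match body.rfind("```") {
        Some(pos) => &body[..pos],
        None => body,
    };
    body.trim()
}
```

The result is still an unparsed string; a real pipeline would hand it to a JSON parser and log a warning on failure.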

fix(dedup): handle None document names correctly in evidence deduplication

Ensure proper deduplication when Evidence objects have None for doc_name
by using "_unknown" placeholder, preventing incorrect deduplication
between documents with explicit names and those without.
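
The placeholder approach can be sketched as follows. The `Evidence` fields and the `(document, node title)` deduplication key are assumptions for illustration; only the `"_unknown"` placeholder comes from the commit message.

```rust
use std::collections::HashSet;

/// Simplified stand-in for the engine's evidence type (fields are assumed).
#[derive(Debug, Clone, PartialEq)]
struct Evidence {
    doc_name: Option<String>,
    node_title: String,
}

/// Keep the first occurrence of each (document, node) pair.
fn dedup_evidence(items: Vec<Evidence>) -> Vec<Evidence> {
    let mut seen = HashSet::new();
    items
        .into_iter()
        .filter(|e| {
            // A missing name maps to a fixed placeholder, so unnamed evidence
            // never collides with evidence from an explicitly named document.
            let doc = e
                .doc_name
                .clone()
                .unwrap_or_else(|| "_unknown".to_string());
            seen.insert((doc, e.node_title.clone()))
        })
        .collect()
}
```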

perf(cache): optimize cache performance with VecDeque and poison recovery

Replace Vec with VecDeque for O(1) LRU eviction operations, reducing
cache maintenance overhead. Add poison lock recovery mechanism to
maintain cache availability when worker threads panic, preventing
silent failures and ensuring continued operation with stale data
instead of blocking access.
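
Both ideas can be illustrated with the standard library alone. The sketch below is an assumption-laden simplification: `Cache` and `lock_or_recover` are hypothetical names, and eviction follows insertion order rather than true recency, since the commit text does not say how accesses are tracked.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::{Mutex, MutexGuard};

/// Bounded cache that evicts the oldest inserted key (hypothetical type).
struct Cache {
    capacity: usize,
    entries: HashMap<String, String>,
    order: VecDeque<String>, // oldest key at the front
}

impl Cache {
    fn new(capacity: usize) -> Self {
        Self { capacity, entries: HashMap::new(), order: VecDeque::new() }
    }

    fn insert(&mut self, key: &str, value: &str) {
        if self.entries.insert(key.to_string(), value.to_string()).is_none() {
            self.order.push_back(key.to_string());
        }
        while self.order.len() > self.capacity {
            // pop_front is O(1); Vec::remove(0) would shift every element.
            if let Some(oldest) = self.order.pop_front() {
                self.entries.remove(&oldest);
            }
        }
    }

    fn get(&self, key: &str) -> Option<&str> {
        self.entries.get(key).map(String::as_str)
    }
}

/// Lock a mutex, recovering the guard if another thread panicked while
/// holding it. The data may be stale, but the cache stays reachable.
fn lock_or_recover<T>(mutex: &Mutex<T>) -> MutexGuard<'_, T> {
    mutex.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}
```

`PoisonError::into_inner` hands back the guard, so callers keep working with whatever state the panicking thread left behind.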
Add comprehensive test example that indexes a realistic technical
document and asks complex questions requiring deep reasoning across
document sections, demonstrating the engine's capability beyond
simple keyword matching.

fix: update README tagline by removing synthesis reference

Remove "Exact, not synthesized" from the header tagline in README.md
to better align with current project focus and messaging.

refactor: enhance logging across core engine components

Add detailed logging information throughout the engine including:
- Evidence evaluation counts in orchestrator
- Replanning evidence metrics
- Navigation planning and rounds tracking
- Index persistence status updates
- Query understanding initiation
- Dispatcher operation flow

Replace generic log messages with structured logging containing
relevant context like document names, round numbers, and operation
metrics for better debugging and monitoring.

- Configure tracing subscriber with environment filter support,
  allowing log level to be controlled via RUST_LOG environment variable
- Add document resolution count logging to track query processing
- Add document loading statistics showing loaded and failed counts

…vigation

Support relative paths with "/" separator in cd command (e.g.,
"Research Labs/Lab B") alongside existing absolute paths. Update
navigation prompts to clarify path support including both relative
paths like "Section/Sub" and absolute paths like "/root/Section".

Add comprehensive tests for relative path navigation scenarios
including success cases and partial failure handling.
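
Relative resolution with a "/" separator might look like the sketch below. `Node`, `node`, and `resolve_relative` are hypothetical names, the case-insensitive title match is an assumption, and absolute paths are left out to keep the example short.

```rust
/// Minimal tree node used only for this illustration.
struct Node {
    title: String,
    children: Vec<Node>,
}

/// Convenience constructor for building small trees.
fn node(title: &str, children: Vec<Node>) -> Node {
    Node { title: title.to_string(), children }
}

/// Walk `path` one segment at a time, starting at `current`.
/// On failure the error names the segment that could not be resolved.
fn resolve_relative<'a>(current: &'a Node, path: &str) -> Result<&'a Node, String> {
    let mut node = current;
    for segment in path.split('/').map(str::trim).filter(|s| !s.is_empty()) {
        let parent = node;
        node = parent
            .children
            .iter()
            .find(|child| child.title.eq_ignore_ascii_case(segment))
            .ok_or_else(|| format!("'{}' not found under '{}'", segment, parent.title))?;
    }
    Ok(node)
}
```

With a tree containing "Research Labs" and a child "Lab B", `resolve_relative(&root, "Research Labs/Lab B")` returns the "Lab B" node, while a path whose second segment is missing fails with a message naming that segment.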

refactor(index): extract keywords from full content instead of samples

Always extract keywords from full node content rather than falling
back to content samples when summaries are empty. This ensures more
comprehensive keyword coverage across documents.

feat(query): enhance query understanding with detailed logging

Include key concepts, strategy hints, and rewritten queries in
understanding logs for better debugging and visibility into query
processing decisions.

feat(search): add content snippets to search results for relevance

Include content snippets around matching keywords in search results
to help users judge relevance. Add new content_snippet utility
function that extracts context-aware text fragments centered on
keywords with configurable length limits and proper UTF-8 boundary
handling. Apply this enhancement to find_cross, worker execution,
and planning components.
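
A keyword-centered snippet with safe boundaries could be sketched like this. The function name `content_snippet` comes from the commit message, but the signature, the case-sensitive match, and the prefix fallback are assumptions.

```rust
/// Return roughly `max_len` bytes of `content` centered on the first
/// occurrence of `keyword`, never cutting through a UTF-8 character.
fn content_snippet<'a>(content: &'a str, keyword: &str, max_len: usize) -> &'a str {
    let Some(pos) = content.find(keyword) else {
        // No match: fall back to a prefix, rounded down to a boundary.
        let mut end = max_len.min(content.len());
        while !content.is_char_boundary(end) {
            end -= 1;
        }
        return &content[..end];
    };
    // Split the remaining budget evenly before and after the keyword.
    let context = max_len.saturating_sub(keyword.len()) / 2;
    let mut start = pos.saturating_sub(context);
    while !content.is_char_boundary(start) {
        start -= 1; // index 0 is always a boundary
    }
    let mut end = (pos + keyword.len() + context).min(content.len());
    while !content.is_char_boundary(end) {
        end += 1; // content.len() is always a boundary
    }
    &content[start..end]
}
```

Widening to the enclosing boundaries means a snippet can run a few bytes over `max_len` when the text is multi-byte.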
…ality

- Increase max_rounds from 8 to 15 and max_llm_calls from 15 to 25
- Update find command to support multi-word searches and provide better
  fallback behavior for title matching
- Enhance search strategy documentation with navigation efficiency
  guidelines
- Update all test cases to reflect new max_rounds value of 15
- Improve find command output to include content snippets when available

Add BFS-based deep search functionality to resolve_target_extended
that searches up to 4 levels deep for matching node titles. The new
search hierarchy prioritizes: 1) Direct children via NavigationIndex,
2) Direct children via TreeNode titles, and 3) Deep descendant search
with breadth-first traversal. Also include comprehensive test coverage
for the new deep search functionality.
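
The third step of that hierarchy, a depth-limited breadth-first search over descendants, can be sketched as below. `TreeNode` here is a simplified stand-in, `find_descendant` and `node` are hypothetical names, and the case-insensitive title comparison is an assumption.

```rust
use std::collections::VecDeque;

/// Simplified stand-in for the engine's tree node.
struct TreeNode {
    title: String,
    children: Vec<TreeNode>,
}

/// Convenience constructor for building small trees.
fn node(title: &str, children: Vec<TreeNode>) -> TreeNode {
    TreeNode { title: title.to_string(), children }
}

/// Breadth-first search below `start`, at most `max_depth` levels deep.
/// Shallower matches are found before deeper ones.
fn find_descendant<'a>(
    start: &'a TreeNode,
    target: &str,
    max_depth: usize,
) -> Option<&'a TreeNode> {
    // Direct children sit at depth 1.
    let mut queue: VecDeque<(&'a TreeNode, usize)> =
        start.children.iter().map(|c| (c, 1)).collect();
    while let Some((current, depth)) = queue.pop_front() {
        if current.title.eq_ignore_ascii_case(target) {
            return Some(current);
        }
        if depth < max_depth {
            queue.extend(current.children.iter().map(|c| (c, depth + 1)));
        }
    }
    None
}
```

Calling it with `max_depth` set to 4 matches the limit stated in the commit message.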

refactor(agent): improve evidence formatting with content previews

Replace character count displays with actual content excerpts in
evidence summaries for both evaluation and replanning phases. Content
is truncated to 500 characters to maintain manageable prompt sizes.
Update format_evidence_summary and format_evidence_context functions
to show meaningful content previews instead of just character counts.

feat(agent): track collected nodes separately from visited nodes

Introduce collected_nodes HashSet to distinguish between nodes that
have been visited during navigation versus nodes whose content has
been specifically collected via cat operations. Add has_evidence_for
method to check collection status and evidence_for_check method to
provide content-excerpt based evidence summaries for sufficiency
checks.
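
The visited versus collected distinction can be sketched with two sets. `NavState`, its methods `visit` and `collect`, and the string node identifiers are assumptions; `collected_nodes` and `has_evidence_for` are the names given in the commit message.

```rust
use std::collections::HashSet;

/// Navigation bookkeeping (hypothetical shape, string ids assumed).
#[derive(Default)]
struct NavState {
    /// Nodes the agent has moved through.
    visited_nodes: HashSet<String>,
    /// Nodes whose content was actually read via a cat operation.
    collected_nodes: HashSet<String>,
}

impl NavState {
    fn visit(&mut self, node_id: &str) {
        self.visited_nodes.insert(node_id.to_string());
    }

    /// Collecting a node implies it was also visited.
    fn collect(&mut self, node_id: &str) {
        self.visit(node_id);
        self.collected_nodes.insert(node_id.to_string());
    }

    /// True only when content was collected, not merely when visited.
    fn has_evidence_for(&self, node_id: &str) -> bool {
        self.collected_nodes.contains(node_id)
    }
}
```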
- Remove character limit truncation from evidence content in evaluation
- Allow full content to be available for LLM assessment of relevance
- Raise MAX_FEEDBACK_CHARS from 500 to 2000, keeping a cap to limit prompt bloat
  while maintaining useful context

fix(logging): add compact formatting to tracing subscriber

- Create new supervisor module to encapsulate the dispatch → evaluate →
  replan logic
- Replace inline supervisor loop implementation with call to
  run_supervisor_loop function
- Add SupervisorOutcome struct to return iteration count, evaluation
  sufficiency status, and LLM call counts
- Maintain same functionality while improving code organization and
  testability
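
The shape of such a loop is sketched below. Only the names `run_supervisor_loop` and `SupervisorOutcome` come from the commit message; the closure-based signature, the field names, and the rule of counting one LLM call per evaluation and per replan are assumptions made to keep the example self-contained.

```rust
/// What the loop reports back (field names are assumed).
#[derive(Debug)]
struct SupervisorOutcome {
    iterations: usize,
    sufficient: bool,
    llm_calls: usize,
}

/// Run dispatch → evaluate → replan until the evidence is judged
/// sufficient or `max_iterations` is reached.
fn run_supervisor_loop<D, E, R>(
    max_iterations: usize,
    mut dispatch: D,
    mut evaluate: E,
    mut replan: R,
) -> SupervisorOutcome
where
    D: FnMut() -> Vec<String>,
    E: FnMut(&[String]) -> bool,
    R: FnMut(&[String]),
{
    let mut evidence: Vec<String> = Vec::new();
    let mut outcome = SupervisorOutcome { iterations: 0, sufficient: false, llm_calls: 0 };
    while outcome.iterations < max_iterations {
        outcome.iterations += 1;
        evidence.extend(dispatch());
        outcome.llm_calls += 1; // assumed: one LLM call per evaluation
        if evaluate(&evidence) {
            outcome.sufficient = true;
            break;
        }
        // Replanning is pointless on the final round.
        if outcome.iterations < max_iterations {
            replan(&evidence);
            outcome.llm_calls += 1; // assumed: one LLM call per replan
        }
    }
    outcome
}
```

Passing the three phases in as closures is what makes the loop testable in isolation, which is the benefit the commit message points to.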

refactor(worker): extract navigation loop into separate module

- Move navigation loop logic from worker module to new navigation module
- Replace inline navigation loop with run_navigation_loop function call
- Split complex navigation logic into smaller helper functions for
  building prompts, handling parsing failures, and managing replanning
- Improve code organization and maintainability

feat(tools): remove content truncation in cat tool

- Remove character limit and truncation logic from cat tool output
- Return full content string instead of truncated preview
- This allows complete evidence collection without size limitations

…nippet logic

BREAKING CHANGE: Removed MAX_FEEDBACK_CHARS constant and automatic
truncation in set_feedback method. Feedback will now be stored as-is
without size limitations.

- Moved content_snippet function to tools module for shared usage
- Updated all references to use the centralized content_snippet function
- Increased snippet length from 150/120 to 300 characters for better context
- Replaced character limit checks with entry count limits in planning
- Added MAX_PLAN_ENTRIES (15), MAX_SECTION_SUMMARIES (10), and
  MAX_EXPANSION_ENTRIES (8) constants for better control over prompt size
- Removed content preview truncation in grep tool
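
Limiting by entry count instead of by characters might be sketched as follows. The constant name and value are taken from the list above; `format_plan_entries` and the wording of the omission note are hypothetical.

```rust
/// Upper bound on plan entries included in a prompt (value from the commit).
const MAX_PLAN_ENTRIES: usize = 15;

/// Render at most `MAX_PLAN_ENTRIES` entries, each kept whole, and note
/// how many were left out rather than cutting any entry mid-text.
fn format_plan_entries(entries: &[String]) -> String {
    let mut lines: Vec<String> = entries
        .iter()
        .take(MAX_PLAN_ENTRIES)
        .map(|entry| format!("- {entry}"))
        .collect();
    let omitted = entries.len().saturating_sub(MAX_PLAN_ENTRIES);
    if omitted > 0 {
        lines.push(format!("({omitted} more entries omitted)"));
    }
    lines.join("\n")
}
```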

vercel Bot commented Apr 22, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| vectorless | Ready | Preview, Comment | Apr 22, 2026 7:26am |

zTgx merged commit fb28c3a into main Apr 22, 2026
6 of 7 checks passed